Markov Random Field Models to Extract The Layout of Complex Handwritten Documents
Authors
Abstract
In this paper we consider the problem of segmenting complex handwritten pages, such as novelists' drafts or authorial manuscripts. We propose stochastic, contextual models that cope with local spatial variability and take into account prior knowledge about the global structure of the document image. The models we use are Markov Random Fields (MRFs). Within the MRF framework, segmentation is equivalent to an image labeling problem and is solved using optimization techniques.
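To make the idea of segmentation as MRF labeling concrete, here is a minimal sketch in Python. It is not the method of the paper: the abstract does not say which optimizer or energy terms are used. The sketch assumes a 4-connected grid of sites, a Potts smoothness prior with a hypothetical weight `beta`, and Iterated Conditional Modes (ICM) as one simple optimizer. The function name `icm_labeling` and the `unary` cost layout are illustrative assumptions.

```python
def icm_labeling(unary, beta=1.0, max_sweeps=10):
    """Label a grid by greedily minimizing an MRF energy with ICM.

    unary[r][c][k] is the (assumed) data cost of giving label k to site (r, c).
    The prior is a Potts model: each neighbour with a different label adds beta.
    """
    rows, cols = len(unary), len(unary[0])
    n_labels = len(unary[0][0])

    # Start from the labeling that minimizes the data cost at each site alone.
    labels = [
        [min(range(n_labels), key=lambda k: unary[r][c][k]) for c in range(cols)]
        for r in range(rows)
    ]

    for _ in range(max_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Labels of the 4-connected neighbours that lie inside the grid.
                neighbours = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def local_energy(k):
                    disagreements = sum(1 for n in neighbours if n != k)
                    return unary[r][c][k] + beta * disagreements

                best = min(range(n_labels), key=local_energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break  # Reached a local minimum of the energy.
    return labels


# Toy 3x3 page: label 1 ("text") is cheap everywhere except a noisy centre site.
toy_unary = [[[1.0, 0.0] for _ in range(3)] for _ in range(3)]
toy_unary[1][1] = [0.4, 0.6]
print(icm_labeling(toy_unary, beta=1.0))
```

With a positive `beta`, the contextual prior overrides the weak local evidence at the centre site, which is the kind of behaviour the abstract attributes to contextual models. ICM only finds a local minimum, so other optimizers may be preferable in practice.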
Similar resources
Context modeling for text/non-text separation in free-form online handwritten documents
Free-form online handwritten documents contain a high diversity of content, organized without constraints imposed on the user. The lack of prior knowledge about content and layout makes the modeling of contextual information crucially important for the interpretation of such documents. In this work, we present a comprehensive investigation of the sources of contextual information that can benefit...
Numerical Field Extraction in Handwritten Incoming Mail Documents
In this communication, we propose a method for the automatic extraction of numerical fields in handwritten documents. The approach exploits the known syntactic structure of the numerical field to be extracted, combined with a set of contextual morphological features, to find the best label for each connected component. Applying an HMM-based syntactic analyzer on the overall document allows to localize...
Approche markovienne bidimensionnelle d'analyse et de reconnaissance de documents manuscrits
In this thesis, we present a general bidimensional Markovian approach in the framework of handwritten document analysis and recognition. This approach, called AMBRES (Bidimensional Markovian Approach for image Recognition and Segmentation), is based on Markov random fields, 2D dynamic programming and a bidimensional analysis of images. AMBRES has been successfully applied to a wi...
Numerical Sequence Extraction in Handwritten Incoming Mail Documents
In this communication, we propose a method for the automatic extraction of numerical fields in handwritten documents. The approach exploits the known syntactic structure of the numerical field to be extracted, combined with a set of contextual morphological features, to find the best label for each connected component. Applying an HMM-based syntactic analyzer on the overall document allows to localize...
Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents
Document layout analysis is an important task needed for handwritten text recognition, among other applications. Text layout commonly found in handwritten legacy documents is in the form of one or more paragraphs composed of parallel text lines. An approach for handwritten text line detection is presented which uses machine learning techniques and methods widely used in natural language processin...